This page last changed on May 28, 2008 by chuckp.

How to Fix FASTCopy Incoming

Owned by: Philip Hart 

How to Fix: 
1. Ingest Failure
  • The FC_Incoming script writes to the Oracle table FCOPY_INCOMING (one row per tarball) and untars the incoming file to:
    • /nfs/farm/g/glast/u23/ISOC-flight/Archive/fcopy/yyyy/mm/day.mm.dd.wkd/utchh/hh.mm.ss
  • Under that one sees (for example):
    • LISOC_2008096171823 LISOC_2008096171823.tar LISOC_2008096171823.tar.fastcopy.log
  • The fc sender gets the return code from FC_Incoming as status and resends if needed.
  • A cron job running as root (cron@glastlnx11) runs flightops-ingest, which writes to:
    • FASTCOPY_RAWARCHIVE, _DATAGRAM, _PACKET, _L0GAP [we calculate this last] tables.
  • It also writes to:
    • /nfs/farm/g/glast/u23/ISOC-flight/Archive/level0/srcmnop/yyyy/.../smnopa[apid]t[something].ICDFILEkey
      For example:
      /nfs/farm/g/glast/u23/ISOC-flight/Archive/level0/src0077/2008/04/108.04.17.Thu/utc22/s0077a0958t1208469600.0000062745
  • Afterwards, cron@glastlnx06 runs the fcopy_dispatch job, i.e., it launches:
    • ISOC/bin/L0Dispatcher.py, which runs (from ISOC/bin):
      • ProcessHSK.py - Trending ingest/limit checking -> (only) Oracle
      • ProcessCMD - logging/some MP reconciliation -> (only) Oracle
      • ProcessSCI -> Oracle
        • -> dirs, files for NonEventReporting
        • -> halfpipe -> L1proc
          Note: See /afs/slac/g/glast/isoc/flightOps/offline/halfPipe/v6r0p2/config (<-prod) for control of what it does.
  • L0Dispatcher queries the FCOPY_INCOMING and ICDFILE tables for tarballs/files that have been ingested but not dispatched -> FC_L0DISPATCH
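
When tracing a failed file, it can help to work out where its level0 copy should sit on disk. The following is a minimal sketch, not part of the ISOC tools: the function name and the regular expression are hypothetical, and the field meanings are inferred only from the example path above. In particular, it assumes that the "t" field is a count of seconds since the Unix epoch in UTC, which is consistent with that example but is not stated on this page.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern for names such as s0077a0958t1208469600.0000062745:
# s<source> a<apid> t<seconds> . <ICDFILE key>
LEVEL0_NAME = re.compile(r"^s(\d{4})a(\d{4})t(\d+)\.(\d+)$")

# Fixed English abbreviations so the result does not depend on the locale.
WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def level0_relative_path(filename):
    """Return the path of a level0 file relative to .../Archive/level0/."""
    match = LEVEL0_NAME.match(filename)
    if match is None:
        raise ValueError("unrecognized level0 file name: %r" % filename)
    source, _apid, seconds, _key = match.groups()

    # Assumption: the t field is seconds since the Unix epoch, in UTC.
    when = datetime.fromtimestamp(int(seconds), tz=timezone.utc)

    # Directory layout: day-of-year.month.day.weekday, then utc<hour>.
    day_dir = "%s.%02d.%02d.%s" % (
        when.strftime("%j"), when.month, when.day, WEEKDAYS[when.weekday()])
    return "src%s/%04d/%02d/%s/utc%02d/%s" % (
        source, when.year, when.month, day_dir, when.hour, filename)
```

For the example file name above, level0_relative_path("s0077a0958t1208469600.0000062745") returns src0077/2008/04/108.04.17.Thu/utc22/s0077a0958t1208469600.0000062745, which matches the example path.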

1. Ingest Failure 

To Test: Launch FASTCopy Monitoring (FCWebView), select Incoming, and check the Status column.
  • If there is an INGESTFAILED message, you will need to reset the "submitted" flag to "new" so that the cron job
    will pick it up and resubmit the job.
    • First, hover over the Filename of the failed package.
    • From the status bar at the bottom of the page, copy the icdfile_pk number
      (e.g., icdfile_pk=63677).
  • From an ISOC environment terminal, you can access the relevant Oracle instance via a wrapper, e.g.:

           rlwrap sqlplus /@isocnightly
    or: 
           ... flight

    Note: For others, see:  $TNS_ADMIN/tnsnames.ora

    Tip:
    • /var/log/flightops/ingest.log on glastlnx11 may provide clues.
    • It may be useful to inspect the table setup via:
             desc fcopy_icdfile
             select * from fcopy_jobstate;
  • To Fix:
    • To change the table status, run:

             update fcopy_icdfile set jobstate_fk = 1 where icdfile_pk in (1234, 234, 545);
    • Or, to reingest an entire tarball:

              ... where icdfile_pk = 123456789

      Note: To test ingest, send a tarball via FC_send.sh.  (This is unlikely to be needed in production.)
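
When several files have failed, typing the key list by hand is error-prone. The sketch below shows one way to pull the icdfile_pk out of the link text copied from the status bar and to assemble the update statement shown above. Both helper names are hypothetical, and the code only builds the SQL text; it does not connect to Oracle. The default jobstate_fk = 1 simply mirrors the example on this page, so check fcopy_jobstate for the actual meaning of each state before running the result in sqlplus.

```python
import re


def extract_icdfile_pk(link_text):
    """Return the integer after 'icdfile_pk=' in the copied link text."""
    match = re.search(r"icdfile_pk=(\d+)", link_text)
    if match is None:
        raise ValueError("no icdfile_pk found in: %r" % link_text)
    return int(match.group(1))


def build_reset_sql(icdfile_pks, jobstate_fk=1):
    """Build the UPDATE statement that resets the listed ICD files.

    jobstate_fk=1 is taken from the example on this page, not from
    the fcopy_jobstate table itself.
    """
    # int() rejects anything that is not a plain number; the set drops repeats.
    keys = sorted({int(pk) for pk in icdfile_pks})
    if not keys:
        raise ValueError("at least one icdfile_pk is required")
    key_list = ", ".join(str(pk) for pk in keys)
    return ("update fcopy_icdfile set jobstate_fk = %d "
            "where icdfile_pk in (%s);" % (int(jobstate_fk), key_list))
```

For example, build_reset_sql([extract_icdfile_pk("icdfile_pk=63677")]) produces a statement that can be pasted at the sqlplus prompt after review.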




Document generated by Confluence on Aug 21, 2008 10:27